Analysis of rice nuclear-localized seed-expressed proteins and their database (RSNP-DB).

Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi, South Campus, New Delhi, India. Proteomics Laboratory, Division of Plant Biotechnology, Sher-e-Kashmir University of Agricultural Sciences & Technology of Kashmir, Shalimar, Srinagar, Jammu & Kashmir, India. Institute of Informatics and Communications, University of Delhi, South Campus, New Delhi, India. Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi, South Campus, New Delhi, India. akhilesh@genomeindia.org.

Scientific reports. 2020;(1):15116

Abstract

Nuclear proteins are primarily regulatory factors governing gene expression. Multiple factors determine the localization of a protein in the nucleus, and accurate identification of nuclear proteins remains difficult. We have attempted to combine information from subcellular prediction tools, experimental evidence, and nuclear proteome data to identify a reliable list of seed-expressed nuclear proteins in rice. Based on the number of prediction tools calling a protein nuclear, we sorted 19,441 seed-expressed proteins into five categories. Of these, half were called nuclear by at least one of the four prediction tools. Further, gene ontology (GO) enrichment and transcription factor composition analysis showed that 6116 seed-expressed proteins could be called nuclear with greater confidence. Localization evidence from experimental data was available for 1360 proteins. Analysis of these proteins showed that a nuclear call is 92.04% accurate for proteins predicted nuclear by at least three tools. The distribution of nuclear localization signals and nuclear export signals showed that the majority of category four members were nuclear resident proteins, whereas the other categories had a low fraction of nuclear resident proteins and a significantly higher proportion of shuttling proteins. We compiled all the above information for the seed-expressed genes in a searchable database named Rice Seed Nuclear Protein DataBase (RSNP-DB), available at https://pmb.du.ac.in/rsnpdb. This information will be useful for understanding the role of the seed nuclear proteome in rice.
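
The categorization step described above can be illustrated with a minimal sketch. It assumes that the five categories correspond to the number of tools (zero to four) calling a protein nuclear, which matches the abstract's reference to "category four" but is not spelled out there. The tool names, protein identifiers, and function names are hypothetical placeholders, not those used to build RSNP-DB.

```python
from typing import Dict, List

N_TOOLS = 4  # the abstract states that four prediction tools were used


def nuclear_category(calls: Dict[str, bool]) -> int:
    """Return the number of tools that call the protein nuclear (0 to 4).

    `calls` maps a (hypothetical) tool name to True if that tool predicts
    nuclear localization for the protein.
    """
    if len(calls) != N_TOOLS:
        raise ValueError(f"expected calls from {N_TOOLS} tools, got {len(calls)}")
    return sum(1 for is_nuclear in calls.values() if is_nuclear)


def group_by_category(predictions: Dict[str, Dict[str, bool]]) -> Dict[int, List[str]]:
    """Sort protein identifiers into categories 0..N_TOOLS by tool agreement."""
    groups: Dict[int, List[str]] = {k: [] for k in range(N_TOOLS + 1)}
    for protein_id, calls in predictions.items():
        groups[nuclear_category(calls)].append(protein_id)
    return groups


# Hypothetical input: placeholder protein IDs and tool names.
example = {
    "protein_1": {"tool_a": True, "tool_b": True, "tool_c": True, "tool_d": True},
    "protein_2": {"tool_a": True, "tool_b": False, "tool_c": True, "tool_d": True},
    "protein_3": {"tool_a": False, "tool_b": False, "tool_c": False, "tool_d": False},
}
categories = group_by_category(example)

# Proteins called nuclear by at least three tools, the threshold at which
# the abstract reports 92.04% accuracy against experimental evidence.
high_confidence = categories[3] + categories[4]
```

Counting agreeing tools rather than weighting them treats all four predictors as equally reliable; that is a simplification made for this sketch.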